Demonstration of Index Techniques for Similarity-based Search in ORDBMSs

Authors

  • Michael Peter Haustein
  • Wolfgang Mahnke
  • Norbert Ritter
Abstract

Today, similarity-based search is used in numerous application fields such as e-commerce, case-based reasoning, knowledge management, and text and image retrieval. To realize similarity-based search in ORDBMSs, concepts and mechanisms are needed for calculating the similarity between a comparison instance and the stored objects. Because function calls during query processing are extremely costly, and because all objects of the search space must be fetched to calculate the similarity values needed to rank the query results, it is essential to offer index access for similarity-based queries in order to reduce response times. In this demonstration, we present local indices for symbolic, numeric, and string attributes, and show the calculation of similarity values for entire objects. These indices support a new way of directly calculating similarity values, also taking table structures into account, without having to access the actual data objects. This can lead to enormous performance benefits.
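The abstract does not specify the similarity measures or the index layout, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes a weighted average of per-attribute ("local") similarities as the object similarity, a lookup table for symbolic values, a range-normalized distance for numeric values, and Python's `difflib.SequenceMatcher` as a stand-in string measure. All function names, the sample rows, and the weights are hypothetical. Each local index maps a distinct attribute value to the row ids holding it, so similarities are computed once per distinct value and the stored objects are not touched during ranking.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def numeric_sim(a, b, value_range):
    # Assumed measure: distance normalized by the attribute's value range.
    return max(0.0, 1.0 - abs(a - b) / value_range)


def string_sim(a, b):
    # Stand-in string measure; the paper may use a different one.
    return SequenceMatcher(None, a, b).ratio()


def symbolic_sim(a, b, table):
    # Assumed measure: symmetric lookup table, 1.0 for identical symbols.
    if a == b:
        return 1.0
    return table.get((a, b), table.get((b, a), 0.0))


def build_local_index(rows, attribute):
    # Map each distinct attribute value to the ids of the rows holding it.
    index = defaultdict(list)
    for row_id, row in rows.items():
        index[row[attribute]].append(row_id)
    return index


def rank(indices, query, local_sims, weights):
    # Weighted average of local similarities, computed from the indices only.
    total_weight = sum(weights.values())
    scores = defaultdict(float)
    for attribute, index in indices.items():
        for value, row_ids in index.items():
            sim = local_sims[attribute](query[attribute], value)
            for row_id in row_ids:
                scores[row_id] += weights[attribute] * sim / total_weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data, used only to build the indices.
rows = {
    1: {"color": "red", "price": 100, "name": "lamp"},
    2: {"color": "orange", "price": 150, "name": "lamps"},
    3: {"color": "blue", "price": 300, "name": "desk"},
}
color_table = {("red", "orange"): 0.5}
local_sims = {
    "color": lambda q, v: symbolic_sim(q, v, color_table),
    "price": lambda q, v: numeric_sim(q, v, 200.0),
    "name": string_sim,
}
weights = {"color": 1.0, "price": 1.0, "name": 1.0}
indices = {attr: build_local_index(rows, attr) for attr in weights}

query = {"color": "red", "price": 100, "name": "lamp"}
ranking = rank(indices, query, local_sims, weights)
for row_id, score in ranking:
    print(row_id, round(score, 3))
```

In this sketch the ranking loop reads only the index entries. The benefit of such a layout would grow with the number of rows that share attribute values, since each local similarity function is called once per distinct value rather than once per object.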

Similar resources

A partition-based algorithm for clustering large-scale software systems

Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...

High-Performance Extensible Indexing

Today’s object-relational DBMSs (ORDBMSs) are designed to support novel application domains by providing an extensible architecture, supplemented by domain-specific database extensions supplied by external vendors. An important aspect of ORDBMSs is support for extensible indexing, which allows the core database server to be extended with external access methods (AMs). This paper describes a new...

The SH-Tree: A Novel and Flexible Super Hybrid Index Structure for Similarity Search on Multidimensional Data

Approaches to indexing and searching feature vectors are an indispensable factor to support similarity search effectively and efficiently. Such feature vectors extracted from real world objects are usually presented in the form of multidimensional data. As a result, many multidimensional data index techniques have been widely introduced to the research community. These index techniques are cate...

An improved opposition-based Crow Search Algorithm for Data Clustering

Data clustering is an ideal way of working with a huge amount of data and looking for a structure in the dataset. In other words, clustering is the classification of the same data; the similarity among the data in a cluster is maximum and the similarity among the data in the different clusters is minimal. The innovation of this paper is a clustering method based on the Crow Search Algorithm (CS...

Automatic Evaluation of Persian Web Video Search Engines Based on Vote Aggregation

Today, the growth of the internet and its strong influence on individuals' lives have led many users to rely on search engines for their daily needs; hence, search engines need to be modified and continuously improved. Therefore, evaluating search engines to determine their performance is of paramount importance. In Iran, as in other countries, extensive research is being performed ...

Publication date: 2003